{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[View the runnable example on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/nano/tutorial/notebook/training/pytorch/accelerate_pytorch_training_ipex.ipynb)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Accelerate PyTorch Training using Intel® Extension for PyTorch*"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[Intel® Extension for PyTorch*](https://github.com/intel/intel-extension-for-pytorch) (also known as IPEX) can boost performance on Intel hardware with AVX-512 Vector Neural Network Instructions (AVX512 VNNI) and Intel® Advanced Matrix Extensions (Intel® AMX) on Intel CPUs. By using `TorchNano` (`bigdl.nano.pytorch.TorchNano`), you can make very few code changes to accelerate training loops via IPEX. Here we provide __2__ ways to achieve this: A) subclass `TorchNano` or B) use `@nano` decorator. You can choose the appropriate one depending on your (preferred) code structure."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "nbsphinx": "hidden"
   },
   "source": [
    "## Prepare Environment for BigDL-Nano"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "nbsphinx": "hidden"
   },
   "source": [
    "At first, you need to install BigDL-Nano for PyTorch:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "nbsphinx": "hidden"
   },
   "outputs": [],
   "source": [
    "!pip install --pre --upgrade bigdl-nano[pytorch] # install the nightly-built version\n",
    "!source bigdl-nano-init # set environment variables"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> 📝 **Note**\n",
    ">\n",
    "> Before starting your PyTorch application, it is highly recommended to run `source bigdl-nano-init` to set several environment variables based on your current hardware. Empirically, these variables will greatly improve performance for most PyTorch applications on training workloads."
   ]
  },
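  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To double-check that the script took effect, you could print a few of the environment variables it is expected to set. The exact variables depend on your hardware and BigDL-Nano version; the names below (e.g. `OMP_NUM_THREADS`) are common examples rather than a guaranteed list:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sanity check: inspect environment variables commonly set by bigdl-nano-init\n",
    "# (the exact set depends on your hardware and BigDL-Nano version)\n",
    "import os\n",
    "\n",
    "for var in [\"OMP_NUM_THREADS\", \"KMP_AFFINITY\", \"LD_PRELOAD\"]:\n",
    "    print(f\"{var} = {os.environ.get(var, '<not set>')}\")"
   ]
  },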
  {
   "cell_type": "markdown",
   "metadata": {
    "nbsphinx": "hidden"
   },
   "source": [
    "> ⚠️ **Warning**\n",
    "> \n",
    "> For Jupyter Notebook users, we recommend to run the commands above, especially `source bigdl-nano-init` before jupyter kernel is started, or some of the optimizations may not take effect."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "nbsphinx": "hidden"
   },
   "source": [
    "## Pre-define Model and Dataloader"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "nbsphinx": "hidden"
   },
   "source": [
    "In this guide, we take the fine-tuning of a [ResNet-18 model](https://pytorch.org/vision/main/models/generated/torchvision.models.resnet18.html) on [OxfordIIITPet dataset](https://pytorch.org/vision/main/generated/torchvision.datasets.OxfordIIITPet.html) as an example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "nbsphinx": "hidden"
   },
   "outputs": [],
   "source": [
    "# Define model and dataloader\n",
    "\n",
    "from torch import nn\n",
    "from torchvision.models import resnet18\n",
    "\n",
    "class MyPytorchModule(nn.Module):\n",
    "    def __init__(self):\n",
    "        super().__init__()\n",
    "        self.model = resnet18(pretrained=True)\n",
    "        num_ftrs = self.model.fc.in_features\n",
    "        # Here the size of each output sample is set to 37.\n",
    "        self.model.fc = nn.Linear(num_ftrs, 37)\n",
    "\n",
    "    def forward(self, x):\n",
    "        return self.model(x)\n",
    "\n",
    "\n",
    "import torch\n",
    "from torchvision import transforms\n",
    "from torchvision.datasets import OxfordIIITPet\n",
    "from torch.utils.data.dataloader import DataLoader\n",
    "\n",
    "def create_train_dataloader():\n",
    "    train_transform = transforms.Compose([transforms.Resize(256),\n",
    "                                          transforms.RandomCrop(224),\n",
    "                                          transforms.RandomHorizontalFlip(),\n",
    "                                          transforms.ColorJitter(brightness=.5, hue=.3),\n",
    "                                          transforms.ToTensor(),\n",
    "                                          transforms.Normalize([0.485, 0.456, 0.406],\n",
    "                                                               [0.229, 0.224, 0.225])])\n",
    "\n",
    "    # apply data augmentation to the train_dataset\n",
    "    train_dataset = OxfordIIITPet(root=\"/tmp/data\", transform=train_transform, download=True)\n",
    "\n",
    "    # prepare data loader\n",
    "    train_dataloader = DataLoader(train_dataset, batch_size=32, shuffle=True)\n",
    "\n",
    "    return train_dataloader"
   ]
  },
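  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "nbsphinx": "hidden"
   },
   "source": [
    "As a quick sanity check of the definitions above, you could run a dummy batch through the model and confirm that it outputs one logit per class (37 for OxfordIIITPet):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "nbsphinx": "hidden"
   },
   "outputs": [],
   "source": [
    "# Sanity check: a dummy batch of 2 RGB images of size 224x224\n",
    "# should produce logits of shape (2, 37)\n",
    "dummy_input = torch.rand(2, 3, 224, 224)\n",
    "print(MyPytorchModule()(dummy_input).shape)"
   ]
  },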
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## A) Subclass `TorchNano`"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In general, two steps are required if you choose to subclass `TorchNano`:\n",
    "\n",
    "1) import and subclass `TorchNano`, and override its `train()` method\n",
    "2) instantiate it with setting `use_ipex=True` , then call the `train()` method\n",
    "\n",
    "For step 1, you can refer to [this page](https://bigdl.readthedocs.io/en/latest/doc/Nano/Howto/Training/PyTorch/convert_pytorch_training_torchnano.html) to achieve it (for consistency, we use the same model and dataset as an example). Supposing that you've already got a well-defined subclass `MyNano`, below line will instantiate it with enabling IPEX, and call its `train()` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "nbsphinx": "hidden"
   },
   "outputs": [],
   "source": [
    "from tqdm import tqdm\n",
    "from bigdl.nano.pytorch import TorchNano # import TorchNano\n",
    "\n",
    "# subclass TorchNano and override its train method\n",
    "class MyNano(TorchNano):\n",
    "    def train(self):\n",
    "        # Move the code for your custom training loops inside the train method\n",
    "        model = MyPytorchModule()\n",
    "        optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)\n",
    "        loss_fuc = torch.nn.CrossEntropyLoss()\n",
    "        train_loader = create_train_dataloader()\n",
    "\n",
    "        # call setup method to set up model, optimizer(s),\n",
    "        # and dataloader(s) for accelerated training\n",
    "        model, optimizer, train_loader = self.setup(model, optimizer, train_loader)\n",
    "        num_epochs = 5\n",
    "\n",
    "        for epoch in range(num_epochs):\n",
    "\n",
    "            model.train()\n",
    "            train_loss, num = 0, 0\n",
    "            with tqdm(train_loader, unit=\"batch\") as tepoch:\n",
    "                for data, target in tepoch:\n",
    "                    tepoch.set_description(f\"Epoch {epoch}\")\n",
    "                    optimizer.zero_grad()\n",
    "                    output = model(data)\n",
    "                    loss = loss_fuc(output, target)\n",
    "                    # Replace loss.backward() with self.backward(loss)\n",
    "                    self.backward(loss)\n",
    "                    optimizer.step()\n",
    "                    loss_value = loss.sum()\n",
    "                    train_loss += loss_value\n",
    "                    num += 1\n",
    "                    tepoch.set_postfix(loss=loss_value)\n",
    "            print(f'Train Epoch: {epoch}, avg_loss: {train_loss / num}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "MyNano(use_ipex=True).train()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; _The detailed definition of_ `MyNano` _can be found in the_ [runnable example](https://github.com/intel-analytics/BigDL/tree/main/python/nano/tutorial/notebook/training/pytorch/accelerate_pytorch_training_ipex.ipynb)."
   ]
  },
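  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To gauge the speedup on your own machine, one option is to time the training loop both without and with `use_ipex=True` (IPEX is disabled by default). This is only a rough wall-clock sketch, not a rigorous benchmark; warm-up, data order, and background load all affect the numbers:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Rough wall-clock comparison (a sketch, not a rigorous benchmark)\n",
    "import time\n",
    "\n",
    "start = time.perf_counter()\n",
    "MyNano().train()                # baseline (IPEX disabled by default)\n",
    "plain_time = time.perf_counter() - start\n",
    "\n",
    "start = time.perf_counter()\n",
    "MyNano(use_ipex=True).train()   # IPEX-accelerated training loop\n",
    "ipex_time = time.perf_counter() - start\n",
    "\n",
    "print(f\"plain: {plain_time:.1f}s, with IPEX: {ipex_time:.1f}s\")"
   ]
  },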
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## B) Use `@nano` decorator"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`@nano` decorator is very friendly since you can only add 2 new lines (import it and wrap the training function) and enjoy the features brought by BigDL-Nano if you have already defined a PyTorch training function with a model, optimizers, and dataloaders as parameters. You can learn the usage and notes of it from [here](https://bigdl.readthedocs.io/en/latest/doc/Nano/Howto/Training/PyTorch/use_nano_decorator_pytorch_training.html). The only difference when using IPEX is that you should specify the decorator as `@nano(use_ipex=True)`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tqdm import tqdm\n",
    "from bigdl.nano.pytorch import nano # import nano decorator\n",
    "\n",
    "@nano(use_ipex=True) # apply the decorator to the training loop\n",
    "def training_loop(model, optimizer, train_loader, num_epochs, loss_func):\n",
    "\n",
    "    for epoch in range(num_epochs):\n",
    "\n",
    "        model.train()\n",
    "        train_loss, num = 0, 0\n",
    "        with tqdm(train_loader, unit=\"batch\") as tepoch:\n",
    "            for data, target in tepoch:\n",
    "                tepoch.set_description(f\"Epoch {epoch}\")\n",
    "                optimizer.zero_grad()\n",
    "                output = model(data)\n",
    "                loss = loss_func(output, target)\n",
    "                loss.backward()\n",
    "                optimizer.step()\n",
    "                loss_value = loss.sum()\n",
    "                train_loss += loss_value\n",
    "                num += 1\n",
    "                tepoch.set_postfix(loss=loss_value)\n",
    "            print(f'Train Epoch: {epoch}, avg_loss: {train_loss / num}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "nbsphinx": "hidden"
   },
   "outputs": [],
   "source": [
    "model = MyPytorchModule()\n",
    "optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)\n",
    "loss_func = torch.nn.CrossEntropyLoss()\n",
    "train_loader = create_train_dataloader()\n",
    "\n",
    "training_loop(model, optimizer, train_loader, num_epochs=5, loss_func=loss_func)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; _A runnable example including this_ `training_loop` _can be seen from_ [here](https://github.com/intel-analytics/BigDL/tree/main/python/nano/tutorial/notebook/training/pytorch/accelerate_pytorch_training_ipex.ipynb)."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> 📚 **Related Readings**\n",
    "> \n",
    "> - [How to install BigDL-Nano](https://bigdl.readthedocs.io/en/latest/doc/Nano/Overview/install.html)\n",
    "> - [How to convert your PyTorch training loop to use TorchNano for acceleration](https://bigdl.readthedocs.io/en/latest/doc/Nano/Howto/Training/PyTorch/convert_pytorch_training_torchnano.html)\n",
    "> - [How to accelerate your PyTorch training loop with \\@nano decorator](https://bigdl.readthedocs.io/en/latest/doc/Nano/Howto/Training/PyTorch/use_nano_decorator_pytorch_training.html)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "base",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0 (default, Jun 28 2018, 13:15:42) \n[GCC 7.2.0]"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "8772eaeb16382a2d9dbb95ffcb3882976733f8dc8a0780f3e0ca9a3a7dc812c0"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
